Supporting Code Clone Inspection using Parameterized Clone Pattern

Author

  • Udo Borkowski
Abstract

Code clone inspection is an integral part of software clone management. It is used to assess the quality of clones or of the tools reporting them, to decide how to resolve code clone issues, and so on. Because clone inspection is a manual process, its feasibility is limited, especially when working with large numbers of clones. This is critical because clone detection tools may return many clones even when applied to medium-sized projects. Parameterized Clone Pattern abstracts clones using parameters and merges them into shared “patterns”. This more abstract view of clones reduces the number of manual clone inspections required. In addition, the approach can be used to provide intuitive visualizations of clones, report potential problems, suggest possible next steps, etc. This makes Parameterized Clone Pattern a good candidate for improving clone inspection efficiency.

Introduction

Code inspection is a time-consuming task. Bellon [1] mentioned that it took him 40 seconds on average to check a clone pair when working on the “Bellon Clone Detection Benchmark” [2]. Since the benchmark contains more than 300,000 clone pairs, inspecting all of them would have taken about 2 person-years (300,000 × 40 s is roughly 3,300 working hours). For this reason Bellon inspected only 2 % of them. There seem to be two options to improve this situation: reduce the time per clone pair inspection, or reduce the number of clone pairs to inspect manually. The concept of Parameterized Clone Pattern (PCP) introduced in this paper uses both options.

Parameterized Clone Pattern

Parameterized Clone Pattern (PCP, or shorter “clone pattern”) is an attempt to make Ira Baxter's definition of code clones as “segments of code that are similar according to some definition of similarity” more concrete.

Definition: Parameterized Clone Pattern
A PCP is a function P(p1, ..., pn) [n >= 0] returning a code fragment. Each pi is called a parameter of P.

Definition: Clone Class
A clone ci is part of the clone class P iff there is a PCP P(p1, ..., pn) and arguments a1, ..., an such that ci is equal to P(a1, ..., an).
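The two definitions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the fragment returned by the hypothetical pattern `swap_pattern`, the helper names `tokens` and `is_instance`, and the choice to treat two fragments as “equal” when their token sequences match regardless of whitespace. Note that the clone class definition asks whether suitable arguments exist, while this sketch only checks one given argument tuple.

```python
import re
from typing import Callable, List, Sequence


def swap_pattern(p1: str, p2: str) -> str:
    """Hypothetical PCP P(p1, p2): returns a code fragment swapping two variables."""
    return f"tmp = {p1}; {p1} = {p2}; {p2} = tmp;"


def tokens(code: str) -> List[str]:
    """Assumed notion of 'equal': compare token sequences, ignoring whitespace."""
    return re.findall(r"\w+|[^\w\s]", code)


def is_instance(clone: str, pattern: Callable[..., str], args: Sequence[str]) -> bool:
    """True if clone equals pattern(*args) under the token-based comparison."""
    return tokens(clone) == tokens(pattern(*args))


print(is_instance("tmp = x;  x = y;\n  y = tmp;", swap_pattern, ("x", "y")))  # True
print(is_instance("tmp = x; x = y; y = x;", swap_pattern, ("x", "y")))      # False
```

A parameterless pattern (n = 0) is simply a function that always returns the same fragment, which corresponds to an exact clone.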
The definition of “equal” depends on the context. For example, whitespace or comments may be ignored when comparing. If two things are considered similar when they have something in common but may vary in certain aspects, then a PCP defines the commonality between clones, while the parameters provide a way to express the variability. Applying the PCP to specific arguments produces instances of the PCP (i.e. code fragments or “clones”).

This definition of PCP covers all types of clones [3]:

  • for exact clones the PCPs are parameterless,
  • for parameter-substituted clones the parameters correspond directly to the identifiers or literals being substituted,
  • for structure-substituted clones the parameters match the text of the subtrees being replaced, and
  • for non-contiguous clones extra parameters are introduced to represent the “gaps”.

Find Clone Pattern using Clone Detection

Code clone detection can be used to find new PCPs in an incremental, interactive way. We assume a clone detection tool returns its result as a set Clones = {cl1, ..., clk}, with each cli (“clone list”) representing a list of code fragments. If |cli| = 2 the clone list is called a clone pair. All elements in cli are considered to be clones of each other. Therefore it should be possible to find a corresponding PCP. For a given clone list cloneList the following algorithm can be used to find its clone pattern(s).

Algorithm: findClonePattern
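The sketch below is not the findClonePattern algorithm. It only illustrates, under strong simplifying assumptions, how a pattern and its arguments could be derived from a clone list: fragments are split on whitespace, all fragments must have the same number of tokens, and every position where the fragments differ becomes its own parameter. It therefore handles only exact and parameter-substituted clones. The names `derive_pattern` and `instantiate` are hypothetical.

```python
from typing import List, Optional, Tuple

Template = List[Optional[str]]  # None marks a parameter slot


def derive_pattern(clone_list: List[str]) -> Tuple[Template, List[List[str]]]:
    """Return a template plus, for each clone, the arguments that reproduce it."""
    if not clone_list:
        raise ValueError("clone list must not be empty")
    token_lists = [clone.split() for clone in clone_list]
    if any(len(t) != len(token_lists[0]) for t in token_lists):
        raise ValueError("sketch only handles clones with equal token counts")

    template: Template = []
    arguments: List[List[str]] = [[] for _ in token_lists]
    for column in zip(*token_lists):
        if all(tok == column[0] for tok in column):
            template.append(column[0])       # common to all clones
        else:
            template.append(None)            # varies: becomes a parameter
            for clone_args, tok in zip(arguments, column):
                clone_args.append(tok)
    return template, arguments


def instantiate(template: Template, args: List[str]) -> str:
    """Apply the pattern: fill each parameter slot with the next argument."""
    it = iter(args)
    return " ".join(next(it) if tok is None else tok for tok in template)


template, arguments = derive_pattern(["x = x + 1 ;", "y = y + 2 ;"])
print(template)                              # [None, '=', None, '+', None, ';']
print(arguments)                             # [['x', 'x', '1'], ['y', 'y', '2']]
print(instantiate(template, arguments[1]))   # y = y + 2 ;
```

In this simplification the repeated identifier `x` yields two separate parameters. Merging slots that always receive the same argument would give a pattern with fewer parameters, and structure-substituted or non-contiguous clones would need a syntax-tree-based comparison instead of a token-wise one.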

Journal title:
  • Softwaretechnik-Trends

Volume 29

Publication date 2009